---
title: "OllamaGenerator"
id: ollamagenerator
slug: "/ollamagenerator"
description: "A component that provides an interface to generate text using an LLM running on Ollama."
---

# OllamaGenerator

A component that provides an interface to generate text using an LLM running on Ollama.

<div className="key-value-table">

|  |  |
| --- | --- |
| **Most common position in a pipeline** | After a [`PromptBuilder`](../builders/promptbuilder.mdx) |
| **Mandatory run variables** | `prompt`: A string containing the prompt for the LLM |
| **Output variables** | `replies`: A list of strings with all the replies generated by the LLM  <br /> <br />`meta`: A list of dictionaries with the metadata associated with each reply, such as token count and others |
| **API reference** | [Ollama](/reference/integrations-ollama) |
| **GitHub link** | https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/ollama |

</div>

## Overview

`OllamaGenerator` provides an interface to generate text using an LLM running on Ollama.

`OllamaGenerator`  needs a `model`  name and a `url` to work. By default, it uses `"orca-mini"` model and `"http://localhost:11434"` url.

[Ollama](https://github.com/jmorganca/ollama) is a project focused on running LLMs locally. Internally, it uses the quantized GGUF format by default. This means it is possible to run LLMs on standard machines (even without GPUs) without having to go through complex installation procedures.

### Streaming

This Generator supports [streaming](guides-to-generators/choosing-the-right-generator.mdx#streaming-support) the tokens from the LLM directly in output. To do so, pass a function to the `streaming_callback` init parameter.

## Usage

1. You need a running instance of Ollama. You can find the installation instructions [here](https://github.com/jmorganca/ollama).
   A fast way to run Ollama is using Docker:

```shell
docker run -d -p 11434:11434 --name ollama ollama/ollama:latest
```

2. You need to download or pull the desired LLM. The model library is available on the [Ollama website](https://ollama.ai/library).
   If you are using Docker, you can, for example, pull the Zephyr model:

```shell
docker exec ollama ollama pull zephyr
```

If you have already installed Ollama in your system, you can execute:

```shell
ollama pull zephyr
```

:::tip Choose a specific version of a model

You can also specify a tag to choose a specific (quantized) version of your model. The available tags are shown in the model card of the Ollama models library. This is an [example](https://ollama.ai/library/zephyr/tags) for Zephyr.
In this case, simply run

```shell
# ollama pull model:tag
ollama pull zephyr:7b-alpha-q3_K_S
```
:::

3. You also need to install the `ollama-haystack` package:

```shell
pip install ollama-haystack
```

### On its own

Here's how the `OllamaGenerator` would work just on its own:

```python
from haystack_integrations.components.generators.ollama import OllamaGenerator

generator = OllamaGenerator(model="zephyr",
                            url = "http://localhost:11434",
                            generation_kwargs={
                              "num_predict": 100,
                              "temperature": 0.9,
                              })

print(generator.run("Who is the best American actor?"))

## {'replies': ['I do not have the ability to form opinions or preferences.
## However, some of the most acclaimed american actors in recent years include
## denzel washington, tom hanks, leonardo dicaprio, matthew mcconaughey...'],
## 'meta': [{'model': 'zephyr', ...}]}
```

### In a Pipeline

```python
from haystack_integrations.components.generators.ollama import OllamaGenerator

from haystack import Pipeline, Document
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.components.builders.prompt_builder import PromptBuilder
from haystack.document_stores.in_memory import InMemoryDocumentStore

template = """
Given the following information, answer the question.

Context:
{% for document in documents %}
    {{ document.content }}
{% endfor %}

Question: {{ query }}?
"""

docstore = InMemoryDocumentStore()
docstore.write_documents([Document(content="I really like summer"),
                          Document(content="My favorite sport is soccer"),
                          Document(content="I don't like reading sci-fi books"),
                          Document(content="I don't like crowded places"),])

generator = OllamaGenerator(model="zephyr",
                            url = "http://localhost:11434",
                            generation_kwargs={
                              "num_predict": 100,
                              "temperature": 0.9,
                              })

pipe = Pipeline()
pipe.add_component("retriever", InMemoryBM25Retriever(document_store=docstore))
pipe.add_component("prompt_builder", PromptBuilder(template=template))
pipe.add_component("llm", generator)
pipe.connect("retriever", "prompt_builder.documents")
pipe.connect("prompt_builder", "llm")

result = pipe.run({"prompt_builder": {"query": query},
									"retriever": {"query": query}})

print(result)

## {'llm': {'replies': ['Based on the provided context, it seems that you enjoy
## soccer and summer. Unfortunately, there is no direct information given about
## what else you enjoy...'],
## 'meta': [{'model': 'zephyr', ...]}}
```
